Cyclone forecasting remains one of the most pressing challenges in meteorological science, where delayed or inaccurate predictions can result in significant loss of life and property. This work proposes an intelligent cyclone detection framework that applies supervised machine learning to structured atmospheric datasets for binary classification of weather conditions. Eight meteorological indicators (sea surface temperature, atmospheric pressure, relative humidity, wind shear, vorticity, geographic latitude, ocean depth, and coastal proximity) form the input feature set. Four classifiers, namely K-Nearest Neighbors (KNN), Decision Tree (DT), Random Forest (RF), and Support Vector Machine (SVM), are independently trained and their outputs aggregated through a majority voting strategy to yield a final consensus-based prediction. Among these, RF records the strongest individual accuracy at 90.67%, followed by SVM at 90.17%, KNN at 89.67%, and DT at 81.67%. The accompanying web application, developed on the Streamlit framework, supports both batch CSV-based prediction and single-record manual input. Real-time atmospheric data for ten Indian coastal cities is retrieved through the OpenWeather API and rendered on an interactive Folium map with color-coded risk indicators. The overall system offers a low-cost, browser-accessible solution well-suited for meteorological departments, academic institutions, and regional disaster response units.
Introduction
This paper presents a machine learning-based cyclone detection and monitoring system designed to improve the speed and accessibility of storm prediction compared to traditional meteorological methods.
The motivation is the growing threat from tropical cyclones, which are becoming more frequent and intense due to climate change. Traditional forecasting methods such as Numerical Weather Prediction (NWP), satellite imaging, and radar systems are accurate but slow, expensive, and resource-intensive, which limits their accessibility in developing regions. In contrast, a trained machine learning model can generate predictions almost instantly from patterns learned from historical data.
The proposed system uses a dataset of 2000 meteorological records, each described by eight features: sea surface temperature, atmospheric pressure, relative humidity, wind shear, vorticity, geographic latitude, ocean depth, and coastal proximity. After preprocessing (handling missing values, removing duplicates, and normalizing features), four machine learning models are trained and compared: KNN, Decision Tree, Random Forest, and SVM. Random Forest and SVM perform best, at 90.67% and 90.17% accuracy respectively, followed by KNN at 89.67%; the Decision Tree is weakest at 81.67%, which is attributed to overfitting.
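The preprocessing and model comparison described above can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the paper's implementation: the column names, the synthetic stand-in data, the hyperparameters, and the use of standardization for "normalizing" are all hypothetical. The 70/30 split is also assumed (a 600-record test set would be consistent with the reported accuracies, e.g. 544/600 = 90.67%, but the split is not stated).

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical column names for the eight features; the paper does not list them.
FEATURES = [
    "sea_surface_temp", "pressure", "humidity", "wind_shear",
    "vorticity", "latitude", "ocean_depth", "coastal_proximity",
]


def make_synthetic_data(n_rows=2000, seed=0):
    """Stand-in for the real dataset: random features and a toy label rule."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame(rng.normal(size=(n_rows, len(FEATURES))), columns=FEATURES)
    # Toy rule: warm sea surface and low pressure favour the positive class.
    score = df["sea_surface_temp"] - df["pressure"] + 0.5 * rng.normal(size=n_rows)
    df["cyclone"] = (score > 0).astype(int)
    return df


def preprocess(df):
    """Drop rows with missing values, then drop exact duplicate rows."""
    return df.dropna().drop_duplicates().reset_index(drop=True)


def train_and_score(df, seed=0):
    """Fit the four classifiers and return test-set accuracy per model."""
    X, y = df[FEATURES], df["cyclone"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed
    )
    models = {
        "KNN": KNeighborsClassifier(n_neighbors=5),
        "DT": DecisionTreeClassifier(random_state=seed),
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
        "SVM": SVC(kernel="rbf"),
    }
    scores = {}
    for name, model in models.items():
        # Scaling sits inside the pipeline so it is fitted on training data only.
        pipeline = make_pipeline(StandardScaler(), model)
        pipeline.fit(X_train, y_train)
        scores[name] = pipeline.score(X_test, y_test)
    return scores


if __name__ == "__main__":
    data = preprocess(make_synthetic_data())
    for name, acc in train_and_score(data).items():
        print(f"{name}: {acc:.4f}")
```

Placing the scaler inside each pipeline keeps test-set statistics out of the fitted scaler. Scaling matters mainly for the distance-based KNN and the SVM; the tree-based models are largely insensitive to it.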
To improve reliability, the system uses a majority voting ensemble, combining predictions from all models to reduce individual errors and improve overall accuracy. This ensemble is especially useful for reducing false alarms.
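As an illustration of the voting step, the sketch below combines binary predictions from the four models row by row. The function names and the `tie_label` parameter are hypothetical; in particular, with four voters a 2-2 split is possible, and how such ties are resolved is not specified here, so the tie-break shown is an assumption.

```python
from collections import Counter


def majority_vote(predictions, tie_label=1):
    """Return the label predicted by most classifiers.

    `tie_label` is a hypothetical tie-break for an even split between
    the two most common labels (e.g. 2-2 with four voters).
    """
    counts = Counter(predictions).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return tie_label
    return counts[0][0]


def ensemble_predict(per_model_predictions, tie_label=1):
    """Combine per-model prediction lists (one list per model) row by row."""
    rows = zip(*per_model_predictions.values())
    return [majority_vote(row, tie_label) for row in rows]


if __name__ == "__main__":
    votes = {
        "KNN": [1, 0, 0, 1],
        "DT":  [1, 1, 0, 0],
        "RF":  [1, 0, 1, 0],
        "SVM": [0, 0, 1, 1],
    }
    print(ensemble_predict(votes))               # [1, 0, 1, 1]
    print(ensemble_predict(votes, tie_label=0))  # [1, 0, 0, 0]
```

The tie-break encodes a policy choice: resolving ties toward 1 (cyclone) favours avoiding missed detections, while resolving toward 0 favours fewer false alarms. scikit-learn's `VotingClassifier(voting="hard")` offers the same kind of hard voting as a ready-made estimator.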
A key feature of the system is its real-time monitoring dashboard built using Streamlit, which allows users to upload data, train models, compare performance, and view cyclone predictions visually. It also integrates the OpenWeather API to fetch live weather data for major Indian coastal cities and display cyclone risk levels on an interactive map using Folium.
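A rough sketch of the live-monitoring path follows: building an OpenWeather current-weather request, extracting the fields that response provides, and colouring a Folium marker by risk. Several parts are assumptions: the three cities are a hypothetical subset of the ten monitored ones, the probability thresholds in `risk_color` are invented for illustration, and the current-weather response does not directly supply features such as sea surface temperature, wind shear, or vorticity, so how the deployed system obtains those is not shown.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

OPENWEATHER_URL = "https://api.openweathermap.org/data/2.5/weather"

# Hypothetical subset of the monitored cities (name -> latitude, longitude).
CITIES = {
    "Chennai": (13.08, 80.27),
    "Visakhapatnam": (17.69, 83.22),
    "Mumbai": (19.08, 72.88),
}


def build_request_url(lat, lon, api_key):
    """Build the current-weather request URL for one location."""
    query = urlencode({"lat": lat, "lon": lon, "appid": api_key, "units": "metric"})
    return f"{OPENWEATHER_URL}?{query}"


def fetch_weather(lat, lon, api_key, timeout=10):
    """Call the API and return the decoded JSON payload (needs network access)."""
    with urlopen(build_request_url(lat, lon, api_key), timeout=timeout) as resp:
        return json.load(resp)


def extract_features(payload):
    """Pull the fields that the current-weather response provides directly."""
    return {
        "pressure_hpa": payload["main"]["pressure"],
        "humidity_pct": payload["main"]["humidity"],
        "wind_speed_ms": payload["wind"]["speed"],
        "latitude": payload["coord"]["lat"],
    }


def risk_color(probability):
    """Map a cyclone probability to a marker colour (hypothetical thresholds)."""
    if probability >= 0.7:
        return "red"
    if probability >= 0.4:
        return "orange"
    return "green"


def build_map(city_risks):
    """Draw one coloured marker per city; `city_risks` maps name -> probability."""
    import folium  # imported lazily so the helpers above work without it

    fmap = folium.Map(location=[15.0, 80.0], zoom_start=5)
    for name, prob in city_risks.items():
        lat, lon = CITIES[name]
        folium.CircleMarker(
            location=[lat, lon], radius=8, color=risk_color(prob),
            fill=True, popup=f"{name}: {prob:.0%}",
        ).add_to(fmap)
    return fmap
```

Keeping the URL construction, parsing, and colour mapping as separate pure functions lets them be checked without network access or an API key. One option for showing the result inside a Streamlit page is to pass the map's HTML representation to `streamlit.components.v1.html`.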
Conclusion
This paper presented a browser-deployable cyclone detection system integrating four supervised classifiers under a majority voting ensemble, applied to a structured eight-feature meteorological dataset. Random Forest achieved the best individual accuracy of 90.67%, and the ensemble strategy provided additional robustness over any single model. The Streamlit interface makes the system accessible to non-specialist users, while OpenWeather API integration bridges the gap between offline model evaluation and real-time operational awareness.
Several directions remain open for future work. Incorporating time-series models such as Long Short-Term Memory (LSTM) networks would allow the system to exploit temporal dependencies in successive atmospheric observations. Extending the binary label to a multi-class cyclone intensity scale aligned with the Saffir-Simpson categories would increase utility for evacuation planning. Cloud deployment on AWS or Google Cloud Run would enable multi-user concurrent access. Finally, integrating satellite imagery as an additional input modality through convolutional feature extractors could substantially improve detection of early-stage cyclone formation before surface-level indicators become significant.